ABSTRACT

Previously established results of Craig (1976, 1979) and Craig and Labovitz (1980) demonstrated
that Landsat data are autocorrelated and can be described by a univariate linear stochastic
process known as an Auto-Regressive-Integrated-Moving-Average model of degree 1, 0, 1 or
ARIMA (1, 0, 1). This model has two coefficients of interest for interpretation: $\phi_1$ and
$\theta_1$. In a comparison of Landsat Thematic Mapper Simulator (TMS) data and Landsat MSS data
several results were established:

(1) The form of the relatedness as described by this model is not dependent upon system look
angle or pixel size.

(2) The $\phi_1$ coefficient increases with decreasing pixel size and increasing topographic
complexity.

(3) Changes in topography have a greater influence upon $\phi_1$ than changes in land cover class.

(4) The $\theta_1$ coefficient seems to vary with the amount of atmospheric haze.

These patterns of variation in $\phi_1$ and $\theta_1$ are potentially exploitable by the remote
sensing community to yield stochastically independent sets of observations, characterize
topography, and reduce the number of bytes needed to store remotely sensed data.
PRELIMINARY EVIDENCE FOR THE INFLUENCE OF PHYSIOGRAPHY
AND SCALE UPON THE AUTOCORRELATION FUNCTION OF
REMOTELY SENSED DATA

INTRODUCTION 

Remotely sensed data possess an exploitable source of information in the form of the
relationship between the digital count of a given pixel and those of its neighbors. The intensity
and nature of this spatial relationship can be measured by examining the covariance between
pixels. To introduce this concept into a remote sensing context, we will first define the covariance
of any two random variables X and Y. This is given by,

$$ E[(X - \mu_X)(Y - \mu_Y)] \qquad (1) $$

where:

E[·] is the expected value operator;

$\mu_X$ is the expected value of X;
$\mu_Y$ is the expected value of Y.


This quantity is estimated by,

$$ \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1} \qquad (2) $$

where:

n is the number of observations;

$\bar{X}$ and $\bar{Y}$ are the arithmetic means of X and Y respectively;

$X_i$ and $Y_i$ are measurements of the X property and Y property on the i-th object.

There are several points to note.

1. The quantity (2) will be positive if the paired deviations
$(\delta_{X_i}, \delta_{Y_i}) = (X_i - \bar{X}, Y_i - \bar{Y})$ tend to have the same sign, and
(2) will be negative if the paired deviations tend to have opposite signs.

2. If $Y_i$ and $\bar{Y}$ are replaced by $X_i$ and $\bar{X}$, (2) becomes the sample variance of
X, $S_X^2$. Similarly, $S_Y^2$ is obtained by replacing $X_i$ and $\bar{X}$ by $Y_i$ and $\bar{Y}$.

3. The covariance between a given X and Y is dependent on the measurement scale of the
variables.

In order to be able to compare pairs of variables measured on different scales, the covariance
may be adjusted by dividing by the product of the square roots of the two variances.
This new quantity is the familiar correlation coefficient, $\rho$. The estimate of $\rho$ is r,
which is given by

$$ r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{(n - 1)\, S_X S_Y} \qquad (3) $$

The range of this quantity is $-1 \le r \le 1$. If we once again replace $Y_i$, $\bar{Y}$ and
$S_Y$ by $X_i$, $\bar{X}$ and $S_X$ (or vice versa), we have the ratio of the sample variance to
itself, i.e. r = 1.
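The estimators (2) and (3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the digital counts below are hypothetical and do not come from the data analyzed in this paper.

```python
import math

def sample_covariance(x, y):
    """Sample covariance, equation (2): summed paired deviations divided by n - 1."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)

def sample_correlation(x, y):
    """Correlation coefficient r, equation (3)."""
    # Point 2 above: the covariance of a variable with itself is its sample variance.
    s_x = math.sqrt(sample_covariance(x, x))
    s_y = math.sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (s_x * s_y)

# Hypothetical digital counts, for illustration only.
counts_a = [21.0, 24.0, 22.0, 27.0, 30.0]
# The same pattern expressed on a different measurement scale.
counts_b = [2.0 * v + 5.0 for v in counts_a]

cov_ab = sample_covariance(counts_a, counts_b)   # depends on the scale (point 3)
r_ab = sample_correlation(counts_a, counts_b)    # does not depend on the scale
```

Rescaling one variable changes the covariance but leaves r unchanged, which is the reason given above for preferring the correlation coefficient when variables are measured on different scales.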


To put the above measures into a remote sensing context, let us define any given scan line
of digital counts as a sequence $\{X_i\}$ with i being an index such that the first pixel in the
scan line is i = 1 and neighboring pixels are consecutively indexed to the end of the scan line.
Let us suppose that there are n + 1 pixels and replace the quantities $Y_i$, $\bar{Y}$ and $S_Y$
in (3) by $X_{i+1}$, $\bar{X}_{i+1}$ and $S_{X_{i+1}}$; then we have

$$ r = \frac{\sum_{i=1}^{n} (X_i - \bar{X}_i)(X_{i+1} - \bar{X}_{i+1})}{S_{X_i} S_{X_{i+1}}\, (n - 1)} \qquad (4) $$

If the sequence $\{X_i\}$ satisfies certain conditions, then $E[X_i] = E[X_{i+1}]$ and
$\mathrm{Var}(X_i) = \mathrm{Var}(X_{i+1})$ [where Var(·) is the variance operator] and (4) becomes

$$ r_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(X_{i+1} - \bar{X})}{S_X^2\, (n - 1)} \qquad (5) $$

This expression is known as the estimate of the autocorrelation at lag 1. Note that with the
exception of $X_1$ and $X_{n+1}$, every element of the sequence is $X_i$ and $X_{i+1}$ at some
point in the summation. This expression is therefore the correlation of the sequence
$\{X_i\}_{i=1}^{n}$ with itself incremented by one pixel, or $\{X_i\}_{i=2}^{n+1}$. In general, we
can estimate the autocorrelation at a lag k (k = 0, 1, ..., n) by

$$ r_k = \frac{\sum_{i=1}^{n+1-k} (X_i - \bar{X})(X_{i+k} - \bar{X})}{S_X^2\, (n - k)} \qquad (6) $$


The sequence $\{r_k\}_{k=0}^{n}$ is known as the estimated autocorrelation function (acf) and a
plot of k versus $r_k$ is known as the correlogram.
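As a minimal sketch of equation (6), the function below estimates the acf of one sequence of pixels. The name `sample_acf` and its argument names are assumptions made for this illustration; it is not the program used for the calculations in this paper.

```python
def sample_acf(x, max_lag):
    """Estimated autocorrelations r_0 .. r_max_lag following equation (6).

    For N = n + 1 pixels, the lag-k sum runs over the N - k available
    pairs and is divided by S_X^2 * (n - k).
    """
    n_pixels = len(x)
    n = n_pixels - 1
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than n = len(x) - 1")
    mean = sum(x) / n_pixels
    dev = [v - mean for v in x]
    variance = sum(d * d for d in dev) / n      # S_X^2 with divisor n = N - 1
    if variance == 0.0:
        raise ValueError("a constant sequence has no defined autocorrelation")
    acf = []
    for k in range(max_lag + 1):
        cross = sum(dev[i] * dev[i + k] for i in range(n_pixels - k))
        acf.append(cross / (variance * (n - k)))
    return acf

# Hypothetical sequence of digital counts along part of a scan line.
scan_segment = [31, 33, 34, 36, 35, 33, 30, 29, 31, 34, 37, 38]
correlogram = sample_acf(scan_segment, max_lag=5)   # pairs (k, r_k) for k = 0..5
```

Because the divisor shrinks with the lag, r_0 is exactly 1 while estimates at other lags can slightly exceed 1 in magnitude for very short sequences; some software divides every lag by the same quantity instead.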


Clearly, if $X_i$ and $X_{i+1}$ are correlated, then $X_i$ and $X_{i+2}$ are going to have a
portion of their correlation induced by their relationships to $X_{i+1}$. Therefore we need to
define a quantity analogous to the partial correlation of conventional statistics. The quantity is
called the partial autocorrelation function (pacf) and is given by the collection of
$\phi_{kk}$'s defined as

$$ \phi_{kk} = \rho_{X_i X_{i+k} \cdot \{X_{i+L}\}_{L=1}^{k-1}}
 = E\big[ (X_i - E[X_i \mid X_{i+L} = x_{i+L};\ L = 1, 2, \ldots, k-1])\,
 (X_{i+k} - E[X_{i+k} \mid X_{i+L} = x_{i+L};\ L = 1, 2, \ldots, k-1]) \big] \qquad (7) $$

where:

E[· | ·] is the conditional expectation of the random variable on the left of the "|"
conditioned upon setting the random variables on the right of the "|" at some arbitrary
but fixed values.*

Box and Jenkins (1970) developed a family of linear stochastic models known as the
Autoregressive-Integrated-Moving-Average or ARIMA models which make use of these two functions.
It can be shown (with considerable algebra which will not be done here) that member
models of the ARIMA family generate specific patterns in the acf and pacf. Thus, given a sample
from an unknown process, the acf and pacf can be estimated and an ARIMA model fitted.
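In practice the partial autocorrelations are usually obtained from the autocorrelations rather than from definition (7) directly. The sketch below uses the Durbin-Levinson recursion, a standard technique that is swapped in here for illustration; the text does not say which algorithm the authors' software used, and the function name is hypothetical.

```python
def pacf_from_acf(rho):
    """Partial autocorrelations phi_kk, k = 1..K, from autocorrelations.

    rho[0] is the lag-0 value (1.0) and rho[k] the lag-k autocorrelation.
    Implements the Durbin-Levinson recursion.
    """
    max_lag = len(rho) - 1
    pacf = []
    prev = []                       # prev[j - 1] holds phi_{k-1, j}
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = rho[1]
        else:
            num = rho[k] - sum(prev[j - 1] * rho[k - j] for j in range(1, k))
            den = 1.0 - sum(prev[j - 1] * rho[j] for j in range(1, k))
            phi_kk = num / den
        # phi_{k, j} = phi_{k-1, j} - phi_kk * phi_{k-1, k-j} for j < k
        current = [prev[j - 1] - phi_kk * prev[k - j - 1] for j in range(1, k)]
        current.append(phi_kk)
        pacf.append(phi_kk)
        prev = current
    return pacf

# A pure AR(1) pattern rho_k = 0.6**k should give a pacf that cuts off after lag 1.
ar1_rho = [0.6 ** k for k in range(6)]
ar1_pacf = pacf_from_acf(ar1_rho)
```

The first partial autocorrelation always equals the lag-1 autocorrelation, which is why the first values of the acf and pacf carry the same information.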


* $E[X_i \mid X_{i+1}]$, for example, may be thought of as the linear regression estimate of $X_i$ based upon $X_{i+1}$.
The degree of complexity of an ARIMA process is expressed as the value of three parameters
known as p, d, q, where p is the order of the auto-regressive process; d is the complexity of a trend
parameter; and q is the order of the moving average (p, d, q are non-negative integers). For example,
if d = q = 0, then we have an Auto-Regressive process of order p, or notationally AR(p), given by,

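$$ X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \cdots + \phi_p X_{t-p} + \delta + a_t $$

where:

$\phi_i$ are a set of coefficients (i = 1, 2, ..., p);

$\delta = \mu\,(1 - \phi_1 - \cdots - \phi_p)$, so that the process mean is
$\mu = \delta / (1 - \phi_1 - \cdots - \phi_p)$;

$a_t$ is a normally and independently distributed random variable with mean 0 and
variance $\sigma_a^2$ [NID(0, $\sigma_a^2$)].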

In other words $X_t$ is dependent upon the p previous values of X ($X_{t-1}, X_{t-2}, \ldots, X_{t-p}$)
plus a random perturbation generated at the present time or location. Letting p = d = 0, then we
have a Moving Average process of order q, notationally MA(q), which is,

$$ X_t = \mu + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q} $$

where:

$\{a_t\}$ is a q length sequence of NID(0, $\sigma_a^2$) random variables. In this model $X_t$ is
dependent upon the present and q previous random perturbations.

It has been demonstrated (Craig, 1976, 1979; Craig and Labovitz, 1980) that Landsat data
are autocorrelated and that the autocorrelation function (acf) can be well approximated by the
ARIMA (1, 0, 1) model of Box and Jenkins (1970). The ARIMA (1, 0, 1) model is given by,

$$ \tilde{X}_t = \phi_1 \tilde{X}_{t-1} + a_t - \theta_1 a_{t-1} \qquad (8) $$

where:

$\{\tilde{X}_t\}$ is a sequence of observations indexed in time or space with each element of the
sequence $\tilde{X}_t = X_t - \mu$;

$\{a_t\}$ is defined as before as a series of NID(0, $\sigma_a^2$) random variables;

$\phi_1$ and $\theta_1$ are coefficients.


Within a remote sensing context the model means that the gray scale value of a given pixel
is dependent upon the gray scale value of an adjacent pixel, a random perturbation associated
with the adjacent pixel ($a_{t-1}$) and a random perturbation specific to the present pixel.
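To make model (8) concrete, the following sketch generates a synthetic sequence from an ARIMA (1, 0, 1) process and evaluates its theoretical acf, using the standard Box and Jenkins result that the lag-1 autocorrelation is $(1 - \phi_1\theta_1)(\phi_1 - \theta_1)/(1 + \theta_1^2 - 2\phi_1\theta_1)$ and that each later lag is $\phi_1$ times the previous one. The function names and the coefficient values 0.9 and 0.4 are hypothetical choices for illustration, not estimates from Landsat or TMS data.

```python
import random

def arma11_acf(phi, theta, max_lag):
    """Theoretical acf of X_t = phi * X_{t-1} + a_t - theta * a_{t-1}, |phi| < 1."""
    rho = [1.0]
    if max_lag >= 1:
        rho1 = (1.0 - phi * theta) * (phi - theta) / (1.0 + theta * theta - 2.0 * phi * theta)
        rho.append(rho1)
    for _ in range(2, max_lag + 1):
        rho.append(phi * rho[-1])           # rho_k = phi * rho_{k-1} for k >= 2
    return rho

def simulate_arma11(phi, theta, n, sigma=1.0, seed=0, burn_in=200):
    """Generate n mean-removed values following equation (8)."""
    rng = random.Random(seed)
    x_prev, a_prev = 0.0, 0.0
    out = []
    for t in range(n + burn_in):
        a_t = rng.gauss(0.0, sigma)         # perturbation specific to the present pixel
        x_t = phi * x_prev + a_t - theta * a_prev
        if t >= burn_in:                    # discard start-up values
            out.append(x_t)
        x_prev, a_prev = x_t, a_t
    return out

theoretical = arma11_acf(phi=0.9, theta=0.4, max_lag=10)
synthetic_line = simulate_arma11(phi=0.9, theta=0.4, n=700)
```

The geometric decay after lag 1 is the exponential decline in the acf that is treated as diagnostic of this model later in the paper.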

While in this paper only scan lines indexed in the direction of scan are analyzed, Craig (1976)
has found the ARIMA (1, 0, 1) model also to be applicable for sequences in the reverse direction
as well as for sequences along a single element in either direction.

It was concluded by Craig and Labovitz (1980) that the value of $\phi_1$ varies with some still
ill-defined location "effect." The authors hypothesized that the location effect is related to some
combination of topography, land cover, and/or season. The $\theta_1$ term, on the other hand, varied
with the percent cloud cover. This study will examine the relative importance of two of these
sources of variation, land cover and physiography, for the acf. A second portion of this study
is motivated by NASA's intention to launch satellites possessing spatial resolutions less than 80 m.
Results from this investigation have implications for data analysis, data interpretation and data
compression that will be covered in later sections. In summary then we will investigate:

(a) Is the ARIMA (1,0, 1) model appropriate for remotely sensed data possessing a spatial 
resolution < 80m? 

(b) If (a) is true, how is spatial resolution manifested in the ARIMA (1,0, 1) model? 

(c) Can we start to determine the relative magnitudes of the contribution to the location 
“effect” of topography versus land cover? 

DATA TYPES AND LOGIC OF EXPERIMENT 

Data from three sources were used in this experiment: 

(a) A Landsat 2 image, scene id 21608-1655 [nominal scene path 36, row 32 (Denver, CO)],
imaged on June 18, 1979.

(b) Two data sets acquired by the Landsat Thematic Mapper Simulator (TMS), the NS001,
mounted aboard a NASA C-130 aircraft.

(1) Data acquired on the plains at the eastern fringe of Denver, CO. This flight took
place on June 20, 1979 starting at 1900 GMT (1 p.m. MDT) and consisted of two
flight lines flown in a north-south direction, each 20.8 nautical miles (nm) in length
with latitude 39° 50'N, longitude 104° 57'W and latitude 39° 24'N, longitude 104°
42'W defining the upper left corner and lower right corner respectively of the study area.

(2) Data acquired in the Northern Rocky Mountains region of Montana. The data
came from flight line number 3, flown on August 29, 1979 commencing at 1830
GMT (12:30 p.m. MDT). The flight line was 19.5 nm in length, flown in a south-
to-north direction along 112° 42'W longitude, starting at 47° 00'N latitude and
ending at 47° 15'N latitude. The flight line passed over Cotter Basin, Montana
and hereafter will be called the Cotter Basin line. Both sets of TMS data were
flown approximately 10,000 feet (3.049 km) above ground level (AGL). Since the
instantaneous field-of-view of the NS001 is 2.5 milliradians, the pixels are
approximately 7.5 m (25 feet) on a side. Table 1 gives the bands for which data were
collected by the TMS and MSS.


Table 1

Spectral bands available for MSS and TMS data

   Channel Number        Band Width (micrometers)
   TMS      MSS          TMS                MSS
   1        -            0.42 - 0.52        -
   2        4            0.52 - 0.60        0.50 - 0.60
   3        5            0.63 - 0.69        0.60 - 0.70
   4        6            0.76 - 0.90        0.70 - 0.80
   5        7            1.00 - 1.30 (1)    0.80 - 1.1
   6        -            1.55 - 1.75        -
   7        -            2.08 - 2.35        -
   8        -            10.4 - 12.5 (2)    -

(1) [...] available from Denver TMS
(2) Partial coverage (40%) Denver TMS
The paper is logically developed into four analyses. Because the large scan angle associated
with the TMS (±50° from nadir) represents a reasonable potential source of variation in the
autocorrelation, we first examine how the autocorrelation function varies with the look angle of the
portion of the scan line. The importance of this analysis lies in its implications for the way we
select starting points for the TMS scan lines. For example, if sequences of pixels with different
look angles have different autocorrelation functions, then we must confine our scan lines to only
one look angle class or randomize over all look angle classes. Otherwise, the method of selecting
the position of the sequence within the scan line is unimportant with respect to these alternatives.
Once a rational method of selecting TMS sequences has been determined, we will address the
question of the differences in the form of the autocorrelation which are attributable to the scale
at which the observations are being made (80 m for MSS versus 7.5 m, at nadir, for TMS). In
the third analysis, we will examine how the acf varies with changes in the land cover. Finally,
we will look at the contribution of physiography to the acf by comparing the acfs of the Denver
TMS data and the Cotter Basin TMS data.

A few caveats are appropriate at this juncture. This experiment was set up as a series of
analyses, instead of one large experiment, to make use of the available data. As is the case when
a large design must be attacked in pieces, the confounding of some effects in other effects and
the failure to detect interactions are possibilities. Confounding of effects arises from having
sources of variation which covary in the experiment so that a significant result assigned to one
effect may actually represent a significant contribution from the confounded effect to the
variation "explained." Interactions can only be detected when the experimental design is completely
crossed or factorial, i.e., each level of each factor appears in combination with each level of all
other factors. In describing this analysis, we will point out where confounding might be a
consideration. Loss of information related to undetectable interactions is always a concern in
exploratory research which does not employ a factorial design. However, we will not be able to address
this problem any further here. Such problems could only be solved by executing a much larger
factorial design, hence the reason for the title containing the words "Preliminary evidence."
However, this experiment will provide useful information as an input to the selection of factors
and their levels in a larger design.

The research then is presented in the order of the experiments described above, preceded by 
a discussion of the data reduction procedure. 

EXPERIMENTAL DESIGN AND DATA REDUCTION 

For each of the experiments outlined in the previous section, an experimental design and
randomization procedure was constructed so that results of formal statistical hypotheses could
be directly translated into conclusions about the questions motivating the experiment. The formal
statistical framework being used is known as analysis of variance (ANOVA). Since the techniques
falling within the purview of analysis of variance are rather complex and very extensive, the
reader interested in pursuing the mechanics or philosophy beyond what is presented below is
recommended to see Fisher, 1971; Scheffe, 1959; and Dayton, 1970.

Figure 1 illustrates the summarization procedure applied to the data. The elements of the
population are scan lines, or portions of scan lines, which represent single samples within these
experiments. The information about the relatedness of pixels is contained in a scan line, but not
in a usable form. Therefore, the scan line is transformed into the acf and partial autocorrelation
function (pacf). The elements of these two functions are analogous both in meaning and method
of computation to the conventional correlation (Pearson product moment) and partial correlation
statistics. While in principle n - 1 (n = the number of pixels in the sequence) values of the acf
and pacf can be calculated, typically only the first few values are significantly different from
zero. Craig and Labovitz (1980) have found that the first 10 values of the acf and pacf convey
all the significant information. Since the acf and pacf are diagnostic of the appropriate ARIMA
model (Box and Jenkins, 1970), we can use the first 19 distinct values from the acf and pacf
(the first 10 values of the acf and the second through tenth values of the pacf; the first values of
the acf and pacf are equal). However, these values, the acf in particular, are correlated. Thus
we can either analyze the 19 values using a MANOVA (multivariate ANOVA) procedure or transform
the 19 observations to a set of independent (orthogonal) observations using a principal
components procedure to produce composite or factor scores. The latter procedure was chosen since
Craig and Labovitz (1980) found the first two factors to be highly interpretable. The location
effect dominates factor 1, while changes in the percentage of cloud cover are related to variation
in factor 2 scores. In short, we use a procedure which transforms the information in each sample
from the intractable form of scan lines, possessing upwards of 700 pixels, to a set of five
independent composite scores.

Figure 1. Procedure for transforming information about the relatedness of pixels from
scan lines to composite scores.
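The last step of this reduction, from correlated acf and pacf values to composite scores, can be sketched as follows. This is a simplified stand-in, not the factor analysis software used in the paper: it extracts only the first principal component of the correlation matrix by power iteration, all function names are hypothetical, and it assumes no column is constant and that the leading component is not orthogonal to the uniform starting vector.

```python
import math

def correlation_matrix(rows):
    """Correlation matrix of the columns of `rows`, plus the standardized data.

    Each row is one sample (e.g. the 19 acf and pacf values of one scan line).
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) for j in range(p)]
    z = [[(r[j] - means[j]) / stds[j] for j in range(p)] for r in rows]
    corr = [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)] for a in range(p)]
    return corr, z

def first_component(corr, iterations=500):
    """Leading eigenvalue and unit eigenvector of `corr` by power iteration."""
    p = len(corr)
    v = [1.0 / math.sqrt(p)] * p
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        eigenvalue = math.sqrt(sum(c * c for c in w))
        v = [c / eigenvalue for c in w]
    return eigenvalue, v

def composite_scores(z, v):
    """Project each standardized sample onto the component to get one score per sample."""
    return [sum(zi * vi for zi, vi in zip(row, v)) for row in z]

# Tiny hypothetical data set: four samples of two perfectly correlated variables.
samples = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [5.0, 10.0]]
corr, z = correlation_matrix(samples)
eigenvalue, loading = first_component(corr)
scores = composite_scores(z, loading)
```

With real data each row would hold the 19 values described above, and later components would be obtained by removing the first component from the correlation matrix and repeating the iteration.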

LOOK ANGLE AND ITS RELATIONSHIP TO THE ACF 

A reasonable potential source of the variation between the forms of the acfs of MSS
and TMS data is the widely differing scan angles possessed by the two systems. If the
contribution to the variation from scan angle is significant, care must be taken in the manner in
which scan lines are selected. Otherwise, look angle effects would be confounded in any
hypothesis designed to test differences related to pixel size.

Using Figure 2, we can examine the effects of scan angle on pixel size. Under the assumption
that Landsat 2 has a nominal altitude of 916.6 km, a nadir pixel of 79 m, and a scan angle
of ±5.78° about nadir (NASA, 1976), it is shown in Figure 2 that the width of a pixel increases
by 0.40 m, or about 0.51 percent, from the nadir pixel to either end of the scan line. On the
other hand, the width of a TMS pixel, assuming an average altitude of 3.049 km AGL, a nadir
pixel of 7.5 m, and a scan angle of ±50° about nadir, increases to 11.86 m at the ends of a scan
line. Thus, both the width and area of a TMS pixel increase by 58.1 percent in going from the
center to the end of a scan line.
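The figures quoted above can be reproduced with a short calculation. The sketch assumes that pixel width scales as 1/cos of the look angle; that scaling is inferred here from the quoted numbers and is not stated explicitly in the text. The function name is hypothetical.

```python
import math

def pixel_width(altitude_m, ifov_rad, look_angle_deg):
    """Ground width of a pixel at a look angle, assuming 1/cos(angle) scaling."""
    return altitude_m * ifov_rad / math.cos(math.radians(look_angle_deg))

# MSS (Landsat 2): the IFOV is backed out of the 79 m nadir pixel and 916.6 km altitude.
mss_altitude = 916_600.0
mss_ifov = 79.0 / mss_altitude
mss_nadir = pixel_width(mss_altitude, mss_ifov, 0.0)
mss_edge = pixel_width(mss_altitude, mss_ifov, 5.78)
mss_increase_m = mss_edge - mss_nadir
mss_increase_pct = 100.0 * mss_increase_m / mss_nadir

# TMS (NS001): 2.5 milliradian IFOV at 3.049 km above ground level.
tms_edge = pixel_width(3049.0, 2.5e-3, 50.0)
tms_nominal_nadir = 7.5                      # nominal nadir pixel quoted in the text
tms_increase_pct = 100.0 * (tms_edge / tms_nominal_nadir - 1.0)

print(f"MSS: +{mss_increase_m:.2f} m ({mss_increase_pct:.2f} percent)")
print(f"TMS: {tms_edge:.2f} m at the scan-line end (+{tms_increase_pct:.1f} percent)")
```

Under this assumption the 11.86 m edge pixel follows from the geometric nadir width of about 7.62 m (3049 m x 2.5 mrad), while the 58.1 percent increase is measured against the nominal 7.5 m nadir pixel.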

We now describe an experiment set up to test for the existence of a relation between look
angle and the acf. This experimental design is a two-way factorial design with the main effects
being channel and look angle. Two TMS channels are included in the analysis, channel 2 (0.52 -
0.60 μm) and channel 6 (1.55 - 1.75 μm). Selection of two of these channels allows us to examine
first if an interaction between channel and look angle exists. Second, by selecting channels
which are set well apart spectrally, we may test the cloud cover hypothesis suggested by Craig
and Labovitz (1980). As has been previously noted, these authors found that differences in cloud
cover were responsible for a significant pattern of variation in factor 2 derived from the principal
component decomposition of the acf-pacf matrix. The type of cloudiness being captured by this
measure has been hypothesized to be general atmospheric haze. Thus, in selecting channels 2 and
6 of TMS, we have two channels which are differentially affected by atmospheric haze and
backscattering. It would be reasonable then to expect a significant channel effect on factor 2 which
would be attributed to the "cloudiness" of the scene. The reader should note that we are
confounding cloudiness with channel, a move that is unavoidable with the data that are available. The
look angle factor is divided into four levels; these are sequences of pixels collected at scan angle
intervals of 50° - 25°, 25° - 0°, 0° - 25° and 25° - 50° (see Figure 3).

Figure 2. Geometry of look angle effects in TMS versus MSS (Landsat).

Since there are 19 variables (the first 10 elements of the acf and elements 2 through 10 of
the pacf), 60 scan lines² per channel were selected at random throughout the flight lines. Thus,
120 scan lines were subset from the Computer Compatible Tapes (CCTs) using the VICAR program
COPY as adapted on the Goddard IBM 360/91. One randomly chosen portion of each scan
line, corresponding to a scan angle class, was used to calculate the acf and pacf. Since a TMS
scan line is 700 pixels long, there are 175 pixels in each scan angle class. Figure 3 relates each
scan angle class to its coding and its position on the scan line. Acfs and pacfs were calculated³
for 15 sequences in each scan angle class. A typical acf and pacf are given in Figure 4. The
exponential decline in the acf and the oscillatory behavior in the pacf are characteristic of an
ARIMA (1, 0, 1). The first 10 terms of the acf and the 2nd through 10th elements of the pacf for
each scan line were input into the SPSS subprogram FACTOR (Kim, 1975).

² A common rule of thumb in choosing the number of samples is that the number should be greater than 3 times the number of
variables being examined; in this case N should be greater than 3 x 19 = 57, i.e. at least 58.

³ All acfs and pacfs in this paper were calculated using a program written by Pack et al., 1972, as implemented on the IBM 370/3033
at the Pennsylvania State University.

Figure 3. The design of the look angle classes for this experiment.

Figure 4. Typical acf and pacf from TMS data (panel a: acf; panel b: pacf). These functions were
derived from a randomly chosen scan line of length 175 from look angle class four.

From the scree plot in Figure 5, it was judged that the first five principal components
contain the common portion of the reliable variation.⁴ These composite scores were analyzed in
five two-way analyses of variance using the P2V program of the BMDP series (Jennrich and
Sampson, 1979a). The results for the first two composite scores are given in Table 2. Clearly
there is no effect due to look angle. Further, the "haze" effect appears, as previously
hypothesized, in the second composite score as a significant effect due to channel. No other effects
appear to explain significant portions of the variation in the five composite scores.

The above analysis was repeated dividing the scan lines into 10 portions. The results were
identical. We thus conclude that the look angle does not affect the acf or pacf and so sequences
of pixels may be taken from any portion of the scan line.
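The F statistics behind these conclusions follow directly from the sums of squares and degrees of freedom listed in Table 2: each effect's mean square is divided by the error mean square. The sketch below repeats that arithmetic as a consistency check. The function and variable names are hypothetical, and the interaction sum of squares for the second score is read as 2.384.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F* = (effect mean square) / (error mean square)."""
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect / ms_error

# (sum of squares, degrees of freedom) as listed in Table 2.
first_score = {"Angle": (4.735, 3), "Channel": (0.001, 1), "Angle x Channel": (2.723, 3)}
first_error = (111.540, 112)
second_score = {"Angle": (3.004, 3), "Channel": (34.856, 1), "Angle x Channel": (2.384, 3)}
second_error = (78.754, 112)

first_f = {name: f_ratio(ss, df, *first_error) for name, (ss, df) in first_score.items()}
second_f = {name: f_ratio(ss, df, *second_error) for name, (ss, df) in second_score.items()}

for name in first_score:
    print(f"{name:16s} F*(score 1) = {first_f[name]:6.2f}   F*(score 2) = {second_f[name]:6.2f}")
```

Only the channel effect on the second composite score yields a large ratio, which is the "haze" effect discussed above.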


SCALE AND THE ACF 

Having demonstrated that any significant pattern of variation which discriminates TMS from
Landsat is unlikely to be due to the large differences in the scan angles of the systems, we
proceed to test the hypothesis about the relationship between pixel size and the acf. It should be
noted from the outset that confounded within the scale effect is a system effect. However, (1)
both systems are electro-mechanical scanners and (2) we may get some feel for the importance
of the confounding by examining the same portion of the spectrum with each system. For
example, if, despite using the same spectral bands, we find a significant difference in the second
composite score of the two systems, we might attribute this difference to system rather than
scale differences. Therefore, scan lines of TMS channel 2 (0.52 - 0.60 μm) and MSS channel 4
(0.50 - 0.60 μm) were used for the analysis.

⁴ A common model of factor analysis divides variation between that variation which is reliable and that which is noise. The
reliable variation is in turn divided between variation which is shared by the random variables (common variation) and variation
which is unique to a single variable. See Rummel, 1970 for further discussion of this model.
Figure 5. Scree plot generated from the eigenvalue decomposition of an acf-pacf correlation
matrix. $\lambda_i$ is the value of the eigenvalue associated with the i-th component
(horizontal axis: i, component number).

Table 2

Two-way ANOVA (look angle by channel)
for first two composite scores

(a) FIRST COMPOSITE SCORE

Source             Degrees of    Sum of      Mean       F*       P(F > F*)
                   Freedom       Squares     Square
Angle                  3           4.735      1.578      1.58     0.197
Channel                1           0.001      0.001      0.00     0.974
Angle x Channel        3           2.723      0.908      0.91     0.438
Error                112         111.540      0.996

(b) SECOND COMPOSITE SCORE

Source             Degrees of    Sum of      Mean       F*       P(F > F*)
                   Freedom       Squares     Square
Angle                  3           3.004      1.001      1.42     0.240
Channel                1          34.856     34.855     49.57     0.000
Angle x Channel        3           2.384      0.795      1.13     0.340
Error                112          78.754      0.703

The sampling and data manipulation procedures are the same as used previously. Sixty scan
lines, each composed of 475 pixels, were selected randomly from both the TMS data and a
corresponding area of the MSS data. Acfs, pacfs, and composite scores were calculated as before.

Table 3 contains the results from the one-way ANOVAs for the first two composite scores
(calculations performed by the BMDP1V program [Engelman, 1979]). These analyses contain
only one testable source of variation, scale, which is presented at two levels: 7.5 m and 80 m.
The first composite score once again represents information in the acf, as all the terms of the acf
load highly on the first component. Clearly, there is a significant pattern of variation across the
first composite score. Meanwhile, there is no evidence of a "haze effect" (see composite score
two, Table 3b), nor were any of the other three composite scores significant at any conventional
confidence level.

The significant variation exhibited in the first composite score is specifically developed in
Figure 6. The plot displays the means, over 60 scan lines, of the first 10 terms of the acf for
TMS-2 and MSS-4. Both mean acfs decline exponentially, as is common for the acf of an ARIMA
(1, 0, 1) process. However, the mean of each element of the TMS-2 derived data is significantly
higher than the corresponding MSS-4 derived mean value, even when we use a Bonferroni
adjustment (Fisher, 1971) for the individual test confidence levels so that the group of tests has an
overall α level of 0.01.

Table 3

One-way ANOVA (scale factor) for first two composite scores
calculated from scan lines of TMS-2 and MSS-4

(a) FIRST COMPOSITE SCORE

Source of          Degrees of    Sum of      Mean       F*       P(F > F*)
Variance           Freedom       Squares     Square
Scale                  1          23.299     23.299     28.73     0.000
Error                118          95.702      0.811

(b) SECOND COMPOSITE SCORE

Source of          Degrees of    Sum of      Mean       F*       P(F > F*)
Variance           Freedom       Squares     Square
Scale                  1           0.240      0.240      0.24     0.626
Error                118         118.755      1.006

We may conclude on the basis of this analysis: 

(1) It appears that the ARIMA (1,0, 1) model is appropriate for the TMS data. 

(2) The effect of scale (pixel size) is to be found in the acf, that is, in composite score
one. From previous research (Craig and Labovitz, 1980), we believe that significant
variation in the acf alone will be reflected in the $\phi_1$ coefficient only.

(3) The TMS data are more highly autocorrelated than the MSS data. Initially we believe
this means that the TMS data possess a greater pixel redundancy than MSS. This
conjecture will need further examination in future research. However, two physical
explanations (relative to the ground data), which can be accepted almost intuitively, are
consistent with a hypothesis of greater redundancy generated by decreasing pixel size.
These explanations would suggest that there are (1) a greater number of fields larger
than 7.5 m than are larger than 80 m and/or (2) that the lengths of slopes are such that
more 7.5 m pixels occur on one slope than 80 m pixels. The next two sections will deal
with the relative importance of landcover (field size differences) versus physiography
(length of slopes) in the acf.

(4) The confounding of system in the scale effect, if it exists, is not of great importance.

Figure 6. Mean values for the first 10 terms of the acfs derived from 60 MSS-4 scan lines
and 60 TMS-2 scan lines.
THE INFLUENCE OF LAND COVER ON THE ACF

The effects upon the acf of four landcover classes (urban, agriculture, rangeland and a
randomly selected "other" class) were examined in this experiment. The first three classes were
defined on the basis of the Anderson level I system (Anderson et al., 1976). The scan lines selected
were obtained through the following procedure. Aerial photography of the Denver flight lines
was photo-interpreted to an Anderson level I classification. The authors then constrained the
classes examined to those which covered fairly large contiguous portions of the study area: a
minimum of 300 pixels by 50 scan lines. This requirement reduced the number of classes to the
aforementioned three: urban, agriculture and rangeland.

From among the contiguous areas of each of these classes, four areas (blocks) of each class
were randomly selected. The blocks of the "other" class were selected by first determining the
average size for blocks of the first three classes, dividing the flight lines up into areas of this mean
size and randomly selecting four such areas. Randomly chosen examples of blocks from each
class are shown in Figure 7. Within each block, two randomly selected scan lines were chosen
for each of two channels, TMS-2 and TMS-4. In this manner, 4 (classes) x 4 (blocks) x 2
(channels) x 2 (scan lines) = 64 TMS scan lines were selected.

Two analyses were performed. The first one was the more elaborate. In this experiment, in
addition to the TMS data, scan lines were randomly chosen from the TMS blocks averaged
(TMSAVE) to 30 m pixels and the Landsat data of the same areas. Since the lengths of Landsat
sequences from these locations were only 50 pixels, the TMS and TMSAVE sequences were
reduced to this length by choosing random starting points. However, the structure of the
components matrix of acfs and pacfs derived from these scan lines was unlike other matrices of
components in that the acf split over the first two components. Further, there was no significant effect
in the analysis of variance of the composite scores. It was concluded that scan lines 50 pixels in
length were insufficient to bring out the structure, since the standard errors of the terms of the
acf and pacf are dependent on the length of series.
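A rough sense of how series length affects these standard errors comes from a standard large-sample approximation: if a sequence were pure white noise, the standard error of each estimated autocorrelation would be about one over the square root of the series length. That approximation is brought in here only for illustration (it is not a calculation from the paper, and it understates the error for strongly autocorrelated data); the function name is hypothetical.

```python
import math

def white_noise_acf_se(series_length):
    """Approximate standard error of an estimated autocorrelation under white noise."""
    return 1.0 / math.sqrt(series_length)

# Sequence lengths used in this paper: 50, 175 and 700 pixels.
for length in (50, 175, 700):
    print(f"length {length:3d}: approximate standard error {white_noise_acf_se(length):.3f}")
```

On this approximation a 50 pixel sequence carries close to four times the sampling error of a 700 pixel sequence, which is in line with the conclusion that the short sequences could not bring out the structure.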
In the second experiment, the Landsat and TMSAVE sequences were dropped from the
analysis. The form of the experimental design used was a partial nesting within a factorial
design (Dayton, 1970). This design has a structural model given by,

$$ Y^c_{ijkl} = \mu^c + \alpha^c_k + \beta^c_j + (\alpha\beta)^c_{kj} + \delta^c_{i(j)} + (\alpha\delta)^c_{ki(j)} + \epsilon^c_{ijkl} $$

where:

$Y^c_{ijkl}$ is the composite score for the c-th component, c = 1, 2, ..., 5;

$\mu^c$ is the mean of the c-th composite score ($\mu^c = 0\ \forall\ c$);

$\alpha^c_k$ is the contribution of channel (k = 1, 2) to variation in the c-th composite score;

$\beta^c_j$ is the contribution of the land cover class (j = 1, 2, 3, 4) to variation in the c-th
composite score;

$(\alpha\beta)^c_{kj}$ is the contribution of the interaction between channel and landcover class;

$\delta^c_{i(j)}$ is the contribution of the i-th block (i = 1, 2, 3, 4) nested within the landcover
class;

$(\alpha\delta)^c_{ki(j)}$ is the contribution of the interaction of the channel with the i-th block
nested within the landcover class;

$\epsilon^c_{ijkl}$ is the error term associated with the c-th composite.

The scheme for the design is displayed in Table 4. Calculations for this rather complicated
design were performed by a BMDP program (Jennrich and Sampson, 1979b) and the results
for the first two composite scores are given in Table 5. If we use the previously mentioned
Bonferroni adjustment to set the overall α level for all 10 effects at 0.05, then the individual
effects must be significant at the 0.05/10 = 0.005 level.


For the first composite score only the blocks effect is significant at the designated $\alpha$ level.
The landcover effect contributes minor, if any, variation to the pattern of autocorrelation in the
TMS data. Thus, the variation among blocks is greater than the variation among landcover
classes. In the analysis of the second composite score the “haze” effect is once again apparent
in the highly significant contribution of the channel to the variation.
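
The Bonferroni screening described above can be sketched in a few lines. This is a minimal
illustration, not the authors' procedure: the variable names are hypothetical, and the
probabilities are the P(F>F*) values printed in Table 5, where 0.000 means a value that rounds
to zero at three decimals.

```python
# Overall significance level shared across all 10 effects
# (5 effects for each of the 2 composite scores).
ALPHA_OVERALL = 0.05
N_EFFECTS = 10
alpha_each = ALPHA_OVERALL / N_EFFECTS  # per-effect level, 0.005

# P(F > F*) as printed in Table 5.
p_values = {
    "first": {"A": 0.180, "B": 0.018, "AB": 0.959, "D(B)": 0.000, "AD(B)": 0.007},
    "second": {"A": 0.000, "B": 0.627, "AB": 0.182, "D(B)": 0.091, "AD(B)": 0.388},
}

# Keep only the effects that pass the per-effect threshold.
significant = {
    score: [effect for effect, p in effects.items() if p < alpha_each]
    for score, effects in p_values.items()
}

for score, effects in significant.items():
    print(score, "composite score:", effects)
```

With the per-effect level at 0.005, only the block effect D(B) passes for the first composite
score and only the channel effect A passes for the second, which matches the interpretation
given in the text.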


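Table 4

Partial nesting in a 2 x 4 factorial design

Channel (A)   Landcover Class (B)   Block (D(B))    Samples
TMS-2         Agriculture           A1 A2 A3 A4     $S^c_{1}$   $S^c_{2}$   $S^c_{3}$   $S^c_{4}$
TMS-2         Urban                 U1 U2 U3 U4     $S^c_{5}$   $S^c_{6}$   $S^c_{7}$   $S^c_{8}$
TMS-2         Rangeland             R1 R2 R3 R4     $S^c_{9}$   $S^c_{10}$  $S^c_{11}$  $S^c_{12}$
TMS-2         Other                 O1 O2 O3 O4     $S^c_{13}$  $S^c_{14}$  $S^c_{15}$  $S^c_{16}$
TMS-4         Agriculture           A1 A2 A3 A4     $S^c_{17}$  $S^c_{18}$  $S^c_{19}$  $S^c_{20}$
TMS-4         Urban                 U1 U2 U3 U4     $S^c_{21}$  $S^c_{22}$  $S^c_{23}$  $S^c_{24}$
TMS-4         Rangeland             R1 R2 R3 R4     $S^c_{25}$  $S^c_{26}$  $S^c_{27}$  $S^c_{28}$
TMS-4         Other                 O1 O2 O3 O4     $S^c_{29}$  $S^c_{30}$  $S^c_{31}$  $S^c_{32}$

$S^c_{n}$ is the $n$th sample from the $c$th composite score; there are two composite scores for each sample.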


The conclusion of this analysis is that landcover class, and by implication field size or number
of boundaries, does not appear to be a major contributor to variation in the “location”
effect.

THE EFFECT OF PHYSIOGRAPHY

Changes in landcover and changes in physiography were hypothesized by Craig and Labovitz 
(1980) as the most likely “causes” of the “location” effect. Having demonstrated that landcover 
is probably not a major factor, we will examine physiography by comparing data from Denver 
versus data from Cotter Basin. The two regions are characterized by very different physiographies. 
The Denver region is on the western edge of the Great Plains. The Cotter Basin, on the other
hand, is in the Northern Rocky Mountain Region, an area considerably more rugged.

Sixty scan lines, each 700 pixels long, of TMS channel 2 were randomly selected from each
location. After data reduction, five one-way ANOVAs were calculated on the composite scores.
Once again, the analyses of the first two scores are presented in Table 6. There is significant
variation due to physiography in the first and second composite scores (physiography is not
significant in the other three scores). The pattern of variation in the first score represents
variation in the acf and hence the location effect of Craig and Labovitz (1980). This will be
examined in greater detail below. It is unclear what produced the significant variation in the
second score. It could be that the weather or the atmospheric conditions were different when the
data were collected. Certainly the difference in solar angle between June (Denver) and August
(Cotter Basin) is considerable, and this in itself might be responsible for the variation.

Table 5

Analysis of variance (partial nesting within 2 x 4 factorial design)
to test the land cover effect

(a) FIRST COMPOSITE SCORE

Source   Error Term   Degrees of   Sum of     Mean      F*       P(F>F*)
                      Freedom      Squares    Square
A        AD(B)         1            1.234      1.234     2.03    0.180
B        D(B)          3           26.526      8.842     5.00    0.018
AB       AD(B)         3            0.180      0.060     0.10    0.959
D(B)     E            12           21.232      1.769     8.67    0.000
AD(B)    E            12            7.294      0.608     2.98    0.007
E

(b) SECOND COMPOSITE SCORE

Source   Error Term   Degrees of   Sum of     Mean      F*       P(F>F*)
                      Freedom      Squares    Square
A        AD(B)         1           26.237     26.237    49.28    0.000
B        D(B)          3            1.557      0.519     0.60    0.627
AB       AD(B)         3            3.046      1.015     1.91    0.182
D(B)     E            12           10.381      0.865     1.80    0.091
AD(B)    E            12            6.389      0.532     1.11    0.388
E

A = Channel   B = Landcover Class   D(B) = Block   E = Error   Other terms are interactions
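
The "Error Term" column of Table 5 determines the denominator of each F ratio: an effect's mean
square is divided by the mean square of its listed error term. The sketch below reproduces this
arithmetic for the first composite score, using the mean squares printed in the table. The
dictionary and function names are hypothetical, and the effects tested against E are left out
because the E row did not survive in the table.

```python
# Mean squares for the first composite score, as printed in Table 5(a).
mean_squares = {"A": 1.234, "B": 8.842, "AB": 0.060, "D(B)": 1.769, "AD(B)": 0.608}

# Error term used as the denominator for each effect (Table 5, "Error Term").
error_term = {"A": "AD(B)", "B": "D(B)", "AB": "AD(B)"}


def f_ratio(effect):
    """F* = mean square of the effect / mean square of its error term."""
    return mean_squares[effect] / mean_squares[error_term[effect]]


f_ratios = {effect: round(f_ratio(effect), 2) for effect in error_term}
print(f_ratios)
```

The channel and interaction effects are tested against AD(B), while landcover class is tested
against blocks within landcover, D(B); this is why a large landcover mean square can still yield
a modest F when the blocks themselves vary strongly.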


Table 6

One-way ANOVA (location-physiography factor) for the first two composite scores
calculated from scan lines of TMS-2

(a) FIRST COMPOSITE SCORE

SOURCE     Degrees of   Sum of     Mean      F*       P(F>F*)
           Freedom      Squares    Square
LOCATION     1           16.347    16.347    18.97    0.000
ERROR      118          102.648     0.870

(b) SECOND COMPOSITE SCORE

SOURCE     Degrees of   Sum of     Mean      F*       P(F>F*)
           Freedom      Squares    Square
LOCATION     1           15.882    15.882    18.18    0.000
ERROR      118          103.113     0.874


Returning to the information contained in the first composite score, Figure 8 is a plot of
the means (over 60 observations) of the first 10 terms of the acfs derived from the Cotter Basin
versus the acfs derived from the Denver data. The means are significantly different for each
term at an overall $\alpha$ level of 0.01, with the means of the Cotter Basin acf higher than those from
Denver. This implies that the Cotter Basin data are more autocorrelated than the Denver data. This
result suggests that the influence of slope, and hence physiographic province, upon the acf is
considerable. Thus, we support the previously made assertions of Craig (1979) and Craig and
Labovitz (1980) about the importance of slope. It will not be tested here, but we suspect that the
results from the acfs arise from the Denver area having shorter length slopes than the Cotter
Basin area.
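
The quantity compared between the two sites, the mean over 60 scan lines of the first 10 terms
of the acf, can be sketched as follows. This is only an illustration under stated assumptions:
the function names are hypothetical, and the scan lines are synthetic first-order autoregressive
series standing in for real TMS-2 data, with two arbitrary coefficients chosen to mimic a more
and a less autocorrelated site.

```python
import numpy as np


def sample_acf(x, max_lag=10):
    """Sample autocorrelations r_1 .. r_max_lag of one scan line."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    c0 = np.dot(d, d)
    return np.array([np.dot(d[:-k], d[k:]) / c0 for k in range(1, max_lag + 1)])


def mean_acf(scan_lines, max_lag=10):
    """Average the acf term by term over a set of scan lines."""
    return np.mean([sample_acf(line, max_lag) for line in scan_lines], axis=0)


def synthetic_scan_lines(phi, n_lines=60, n_pixels=700, seed=0):
    """Hypothetical stand-in data: AR(1) series with coefficient phi."""
    rng = np.random.default_rng(seed)
    shocks = rng.standard_normal((n_lines, n_pixels))
    lines = np.empty((n_lines, n_pixels))
    lines[:, 0] = shocks[:, 0]
    for t in range(1, n_pixels):
        lines[:, t] = phi * lines[:, t - 1] + shocks[:, t]
    return lines


# Two synthetic "sites" with different degrees of autocorrelation.
high_phi = mean_acf(synthetic_scan_lines(phi=0.9, seed=1))
low_phi = mean_acf(synthetic_scan_lines(phi=0.6, seed=2))
for lag, (h, l) in enumerate(zip(high_phi, low_phi), start=1):
    print(lag, round(h, 3), round(l, 3))
```

Plotting the two mean acfs against lag gives a figure of the same form as Figure 8, with the
more autocorrelated series lying above the other at every lag.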

Figure 8. Mean values for the first 10 terms of the acfs derived from 60 TMS-2
scan lines over Denver and 60 TMS-2 scan lines over Cotter Basin.

SUMMARY AND CONCLUSIONS

We have tried through a series of experiments to determine the basis of the location effect
of Craig and Labovitz (1980) and the applicability of the ARIMA (1, 0, 1) model to data collected
at different spatial resolutions. Subject to the limitations outlined in the preceding text, we
have demonstrated:

(1) There is no impact of look angle on the acf;

(2) Changes in pixel size act to increase the autocorrelation of the data as the pixel size
decreases; however, this change can be accommodated in the $\phi_1$ coefficient and does
not require abandoning the ARIMA (1, 0, 1) model;

(3) There is considerable support for the conjecture that atmospheric conditions are reflected
in the second component, which in turn is related to the $\theta_1$ coefficient;

(4) Physiography is of far greater importance than land cover in explaining the location
effect.

It is clear from the analyses that very useful information can be gleaned from the acf and
pacf of remotely sensed data. Since the ARIMA (1, 0, 1) model is completely defined by the
coefficients $\phi_1$ and $\theta_1$, our results and those of Craig (previously referenced) go a long way
towards describing the behavior of the model. If $\phi_1$ is indeed largely controlled by physiography
(holding pixel size constant), then this information can be fairly easily exploited. Since there are
only 20 physiographic regions and subregions in the U.S. (Fenneman, 1938), $\phi_1$ will take on a
limited number of values.¹ Thus the value of $\phi_1$ may readily be used to characterize the terrain
being observed. Further, having obtained an estimate of $\phi_1$ (and $\theta_1$), the data may be filtered by
the model to yield the underlying stochastically independent process. Craig (personal
communication) has shown that the variance of a scan line, and by implication the variance-covariance
matrix, is vastly inflated (approximately an order of magnitude, depending on the values of $\phi_1$ and
$\theta_1$) by the presence of autocorrelation in the data. The filtering of the data is likely to
enhance our ability to classify using remotely sensed data. Furthermore, since the variance is
related to the information content of the data, which in turn defines the number of bits needed to
quantify the data, a decrease in the variance would theoretically allow us to code the information
with fewer bits. This would allow substantial savings in the storage of Landsat and other digital
imagery. We have done little research on the feasibility of a coding scheme
to exploit this information, but recommend that some attempt be made in this direction. Finally,
since estimates of $\phi_1$ and $\theta_1$ are fairly easy to calculate (Nelson, 1973), both the filtering and
recoding are excellent candidates for on-board satellite processing.
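
The filtering step and the variance inflation can be sketched as below. This is a hedged
illustration, not the authors' implementation. It assumes the sign convention
z_t = phi*z_{t-1} + a_t - theta*a_{t-1}, zero start-up values, and illustrative coefficients of
0.9 and -0.4 taken from the middle of the ranges quoted in the footnote; all function names are
hypothetical.

```python
import numpy as np


def variance_inflation(phi, theta):
    """Theoretical var(z) / var(a) for a stationary ARMA(1,1) process
    z_t = phi*z_{t-1} + a_t - theta*a_{t-1} (assumed sign convention)."""
    return (1.0 + theta**2 - 2.0 * phi * theta) / (1.0 - phi**2)


def simulate_arma11(phi, theta, n, rng):
    """Generate one synthetic scan line z and the shocks a that drive it."""
    a = rng.standard_normal(n)
    z = np.empty(n)
    z[0] = a[0]  # start-up assumption: earlier values are zero
    for t in range(1, n):
        z[t] = phi * z[t - 1] + a[t] - theta * a[t - 1]
    return z, a


def prewhiten(z, phi, theta):
    """Invert the model to recover the independent shocks:
    a_t = z_t - phi*z_{t-1} + theta*a_{t-1}."""
    a = np.empty(len(z))
    a[0] = z[0]
    for t in range(1, len(z)):
        a[t] = z[t] - phi * z[t - 1] + theta * a[t - 1]
    return a


phi, theta = 0.9, -0.4
rng = np.random.default_rng(0)
z, shocks = simulate_arma11(phi, theta, 700, rng)
filtered = prewhiten(z, phi, theta)

print("theoretical inflation:", round(variance_inflation(phi, theta), 2))
print("sample inflation:", round(z.var() / filtered.var(), 2))
```

Under these assumptions the theoretical ratio is close to 10, in line with the order-of-magnitude
inflation quoted above, and the filtered series has correspondingly smaller variance than the
raw scan line.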

We wish to reiterate the need to verify our results by a more comprehensive study. This study 
should have a factorial design to satisfy concerns about the reliability of the results. 

¹ It has been our experience, as well as that of Craig, that for remotely sensed data $\phi_1$ is in the range 0.85 to 0.95 and $\theta_1$ is in the range
-0.35 to -0.45.